Abstract

We compute the mixing rate of a non-backtracking random walk on a regular expander. 
Using some properties of Chebyshev polynomials of the second kind, we show that this rate 
may be up to twice as fast as the mixing rate of the simple random walk. The closer the 
expander is to a Ramanujan graph, the higher the ratio between the above two mixing rates is. 

As an application, we show that if G is a high-girth regular expander on n vertices, then
a typical non-backtracking random walk of length n on G does not visit a vertex more than
$(1 + o(1)) \frac{\log n}{\log\log n}$ times, and this result is tight. In this sense, the multi-set of visited vertices is
analogous to the result of throwing n balls to n bins uniformly, in contrast to the simple random
walk on G, which almost surely visits some vertex $\Omega(\log n)$ times.



1 Introduction 

1.1 Background and definitions 

Let G = (V, E) be an undirected graph. A random walk of length k on G, from some given vertex
$w_0 \in V$, is a uniformly chosen member of:

\[ W^{(k)} = \{ (w_0, w_1, \ldots, w_k) : w_t \in V ,\ w_{t-1} w_t \in E \text{ for all } t \in [k] \} . \]

Equivalently, such a walk is a finite Markov chain $\mathcal{M} = (X_0, \ldots, X_k)$ on the state space V, where
$X_0 = w_0$ and the transition probabilities are $P_{uv} = \Pr[X_i = v \mid X_{i-1} = u] = \mathbf{1}_{\{uv \in E\}} / \deg(u)$. For
further information on Markov chains, see, e.g., [10].
The extensive study of random walks on graphs was motivated by the following useful property,
which we first state informally. While the random walk is simple to analyze and to implement in 
many frameworks, it "mixes" in G after a relatively small number of steps, provided G satisfies 
some natural requirements. Thus, the random walk provides an efficient method of sampling the 
graph vertices, a fact which has many applications in Theoretical and Applied Computer Science. 
See ^Sl for a survey on the subject. 

The following facts are well known (see, e.g., [3], [11], JH]). If G is a connected and non-bipartite
undirected graph, then the Markov chain $\mathcal{M}$, corresponding to the random walk on G,
is irreducible and aperiodic. In this case, $\mathcal{M}$ converges to a unique stationary distribution, $\pi$,
regardless of its starting position, where $\pi(u) = \deg(u) / (2|E|)$. The mixing rate of the random walk on G
measures how fast $\mathcal{M}$ converges to the stationary distribution, and is defined as follows:

\[ \rho = \rho(G) = \limsup_{k \to \infty} \max_{u,v \in V} \left| P^{(k)}_{uv} - \pi(v) \right|^{1/k} , \tag{1} \]

where $P^{(k)}_{uv} = \Pr[X_{t+k} = v \mid X_t = u]$. The notion of mixing time, the number of steps it takes $\mathcal{M}$ to
get "sufficiently close" to $\pi$, has several commonly used definitions, and for each of these definitions
there are lower and upper bounds as a function of $\rho$ and n. For instance, letting $P^{(k)}_u$ denote the
distribution of $\mathcal{M}$ at time k given that $X_0 = u$, one may define the mixing time $\tau_\varepsilon$ as the minimal
number of steps it takes $P^{(k)}_u$ and $\pi$ to be at most $\varepsilon$-far in terms of their total variation distance,
maximized over all vertices $u \in V$.

An important special case of the above is the one where the graph G is regular. In this case, the
stationary distribution $\pi$ is the uniform distribution, being an eigenvector of the transition probabilities
matrix P = A/d (where A is the adjacency matrix of the graph and d is its regularity degree).
Hence, whenever G is connected and non-bipartite, the random walk eventually approximates the
uniform distribution. As we next specify, these conditions, which are both necessary and sufficient for the
random walk on G to mix, are determined by the spectrum of G.

Let G be a d-regular graph. The eigenvalues of G, that is, the eigenvalues of its (symmetric)
adjacency matrix, are $d = \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$, and $|\lambda_i| \le d$ for all i (by the Perron-Frobenius
Theorem). The multiplicity of the eigenvalue d is equal to the number of connected components of
G, and $\lambda_n = -d$ iff G is bipartite (proofs of these well known facts can be found, for instance, in [7]).
Therefore, whenever G is d-regular, the conditions that G should be connected and non-bipartite
become equivalent to requiring that $\lambda = \max\{\lambda_2, |\lambda_n|\}$ satisfies $\lambda < d$. Define the following:

Definition. An $(n, d, \lambda)$-graph, for some integer d and some $\lambda < d$, is a d-regular graph on n
vertices whose second largest eigenvalue in absolute value is $\lambda$.

This notion was introduced by the first author in the 80's, motivated by the fact that if $\lambda$ is
much smaller than d, then the graph has strong pseudo-random properties. We mention a few of
the properties of these graphs, and refer the readers to for an extensive survey of the subject. 
Let G = (V, E) denote an $(n, d, \lambda)$-graph. First, the behavior of G resembles that of a random
graph of edge density d/n in the following sense: if A, B are (not necessarily disjoint) subsets of
vertices, then $\left| e(A, B) - \frac{d}{n}|A||B| \right| \le \lambda \sqrt{|A||B|}$, where e(A, B) denotes the number of ordered pairs
$\{(a, b) : a \in A, b \in B, ab \in E\}$ (see [3], Corollary 9.25). In other words, every two sets of vertices
A, B have roughly the "right" number of edges between them. Second, the expansion property of
G is closely related to the eigenvalue gap $d - \lambda$, as stated next. Defining the vertex boundary of
X, $\delta X$, as the set of neighbors of X in $V \setminus X$, it is known that $|\delta X| \ge \frac{d - \lambda}{2d} |X|$ for all sets X of
size at most n/2 [2]. Conversely, if

\[ |\delta X| \ge c|X| \text{ for some } c > 0 \text{ and all } X \subset V ,\ |X| \le \frac{n}{2} , \tag{2} \]

then $d - \lambda \ge c^2 / (4 + 2c^2)$, implying a discrete version of Cheeger's inequality (PQ). A graph
satisfying (2) with c bounded away from 0 is commonly referred to as an expander, and according
to this definition $(n, d, \lambda)$-graphs with $\lambda$ bounded away from d and regular expanders are very close
notions.
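
The edge-distribution inequality above can be checked numerically on a small example. The sketch below is only an illustration and is not part of the paper: it assumes the standard facts that the Petersen graph is 3-regular on 10 vertices with nontrivial eigenvalues 1 and -2 (so $\lambda = 2$), and the helper names `petersen_adjacency`, `ordered_edge_count` and `mixing_lemma_holds` are hypothetical.

```python
import math
import random

def petersen_adjacency():
    """Adjacency sets of the Petersen graph: outer 5-cycle, inner pentagram, spokes."""
    adj = {v: set() for v in range(10)}
    for i in range(5):
        for a, b in ((i, (i + 1) % 5),            # outer cycle edge
                     (5 + i, 5 + (i + 2) % 5),    # inner pentagram edge
                     (i, 5 + i)):                 # spoke
            adj[a].add(b)
            adj[b].add(a)
    return adj

def ordered_edge_count(adj, A, B):
    """e(A, B): number of ordered pairs (a, b) with a in A, b in B and ab an edge."""
    return sum(1 for a in A for b in adj[a] if b in B)

def mixing_lemma_holds(adj, A, B, d, lam):
    """Check |e(A,B) - (d/n)|A||B|| <= lam * sqrt(|A||B|), with a tiny float tolerance."""
    n = len(adj)
    deviation = abs(ordered_edge_count(adj, A, B) - d * len(A) * len(B) / n)
    return deviation <= lam * math.sqrt(len(A) * len(B)) + 1e-12

adj = petersen_adjacency()
rng = random.Random(0)          # fixed seed, so the sampled subsets are reproducible
vertices = list(adj)
trials = []
for _ in range(2000):
    A = {v for v in vertices if rng.random() < 0.5}
    B = {v for v in vertices if rng.random() < 0.5}
    trials.append(mixing_lemma_holds(adj, A, B, d=3, lam=2.0))
print(all(trials))
```

Random subsets are sampled instead of enumerating all pairs only to keep the run short; since the inequality holds for every pair of subsets, the outcome does not depend on the seed.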

In many applications of random walks on expanders, there is not much sense in allowing the walk
to backtrack, besides making the model easier to understand and to analyze. A non-backtracking
random walk on an undirected graph G is a walk which does not traverse the same edge twice in a
row. In the first part of this paper, we determine the mixing rate of non-backtracking random walks
on expanders, using some properties of Chebyshev polynomials of the second kind (the connection
between these polynomials and non-backtracking walks follows ideas from [15, 12]). We obtain that
for $3 \le d \le n^{o(1)}$, the mixing-rate of a non-backtracking random walk on an $(n, d, \lambda)$-graph is at
most the mixing-rate of a simple random walk on the same graph. In fact, the ratio between the
two may reach up to $\frac{2(d-1)}{d}$, as formulated in the next subsection.

Let G be a d-regular expander. The following definition of the mixing-time of a random walk
on G corresponds to an $L_\infty$ distance of $\frac{1}{2n}$, as well as to a relative pointwise distance (r.p.d.) of $\frac{1}{2}$,
between $\pi$ and $P^{(k)}_u$, for all $u \in V$:

\[ \tau = \tau(G) = \min \left\{ t : \left| P^{(k)}_{uv} - \frac{1}{n} \right| \le \frac{1}{2n} \text{ for all } u, v \in V \text{ and } k \ge t \right\} . \tag{3} \]

As G is a regular expander, $\tau = O(\log n)$. Notice that sampling the position of the random walk
$\mathcal{M}$ at time-points which are at least $\tau$-apart gives a more or less independent and uniformly
distributed set of vertices. On the other hand, a set of vertices sampled at constant-distance
time-points is clearly very much dependent. As we next show, there is a special interest in the
distribution of the set of vertices along O(n) consecutive steps of the random walk.

An example of this is the amplification of randomized algorithms (such as the Rabin-Miller
primality testing algorithm). Let A denote such an algorithm which uses log n random bits; the
naive parallel repetition of A spends $\Theta(n \log n)$ bits in order to reduce the error probability
exponentially in n. It is well known that the probability that a random walk of length k avoids a
given set of vertices of constant proportion decreases exponentially with k (see, e.g., [3], Corollary
9.28). Therefore, if G is a regular expander of fixed degree, feeding the positions of a random walk
of length O(n) on G as the random seeds for the algorithm reduces the error probability of the
algorithm exponentially, using only O(n) random bits.

In the above application of conserving randomness when amplifying randomized algorithms,
our concern was the probability that a random walk of length n misses a large given set of vertices.
Instead, in load balancing applications, the concern is the maximal number of times that a random
walk of length n visits a vertex. This corresponds to the classical balls and bins paradigm (see,
e.g., (HI, 0), which discusses the result of throwing n balls to n bins, independently and uniformly
at random. In the balls and bins experiment, the bin with the largest number of balls typically
contains $(1 + o(1)) \frac{\log n}{\log\log n}$ balls (see 0).

As we later show, the random walk is unsuitable for conserving randomness in this case, as
a typical random walk of length n has a maximal load of $\Omega(\log n)$. As an application for
non-backtracking random walks, we show here that the maximal number of times that such a walk of
length n visits a vertex is $(1 + o(1)) \log n / \log\log n$ on high girth expanders with n vertices.

Throughout the paper, we say that an event, which is defined for an infinite series of graphs, 
occurs with high probability, or almost surely, or that almost every graph of an infinite series of 
graphs satisfies some property, if the probability for the corresponding event tends to 1 as the 
number of vertices tends to infinity. Unless stated otherwise, all logarithms are in the natural 
basis. 

1.2 Main results 

Let G = (V, E) denote an undirected graph. Define a non-backtracking random walk of length k on
G, from some given vertex $w_0 \in V$, as a uniformly chosen member of:

\[ \widetilde{W}^{(k)} = \{ (w_0, \ldots, w_k) : w_t \in V ,\ w_{t-1} w_t \in E \text{ for all } t \in [k] ,\ w_{t-1} \ne w_{t+1} \text{ for all } t \in [k-1] \} . \]

Equivalently, a non-backtracking random walk on G from $w_0$ is a finite Markov chain $\widetilde{\mathcal{M}}$, whose state
space is $\vec{E}$, the set of directed edges of G, taking each edge in both orientations. The distribution
of the initial state is given by $\Pr[\widetilde{X}_0 = (w_0, u)] = \mathbf{1}_{\{w_0 u \in E\}} / \deg(w_0)$ (and 0 elsewhere), and the
transition probabilities are $P_{(u,v),(v,w)} = \mathbf{1}_{\{u \ne w\}} / (\deg(v) - 1)$ (and 0 elsewhere). If G is d-regular,
then the transition probabilities matrix is doubly stochastic, hence the uniform distribution is a
stationary distribution of $\widetilde{\mathcal{M}}$. Notice that if G is 2-regular, then it is a disjoint union of cycles,
hence a non-backtracking random walk on G is periodic and does not converge to a stationary
distribution. We therefore require that $d \ge 3$, in addition to the requirements that G should
be connected and non-bipartite, and these necessary conditions prove to be sufficient for $\widetilde{\mathcal{M}}$ to
converge to the uniform distribution.
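
The step-by-step description of the chain can be sketched in code as follows. This is a minimal illustration rather than the authors' implementation: the graph is assumed to be given as adjacency lists with minimum degree at least 2, and the function name `non_backtracking_walk` and the example graph $K_4$ are choices made here.

```python
import random

def non_backtracking_walk(adj, start, k, rng=None):
    """Sample (w_0, ..., w_k): a walk that never traverses the same edge twice in a row.

    adj   : dict mapping each vertex to a list of its neighbours (min degree >= 2 assumed)
    start : the initial vertex w_0
    k     : number of steps
    """
    rng = rng or random.Random()
    walk = [start]
    prev = None                      # no previous vertex before the first step
    for _ in range(k):
        cur = walk[-1]
        # First step: uniform over all neighbours; later steps: uniform over the
        # deg(cur) - 1 neighbours other than the vertex we just came from.
        choices = [w for w in adj[cur] if w != prev]
        prev = cur
        walk.append(rng.choice(choices))
    return walk

# Example on the complete graph K_4, which is 3-regular.
K4 = {u: [v for v in range(4) if v != u] for u in range(4)}
walk = non_backtracking_walk(K4, start=0, k=20, rng=random.Random(0))
print(walk)
```

Tracking only the previous vertex is equivalent to running the chain on directed edges, since the state (u, v) is determined by the last two vertices of the walk.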

Figure 1: Mixing rates of simple and non-backtracking random walks on regular graphs. (Panel (b): 10-regular graphs.)

Let G = (V, E) denote an $(n, d, \lambda)$-graph for $d \ge 3$. Recalling (1), define the mixing rate of a
non-backtracking random walk on G as:

\[ \tilde\rho(G) = \limsup_{k \to \infty} \max_{u,v \in V} \left| \widetilde{P}^{(k)}_{uv} - \frac{1}{n} \right|^{1/k} , \tag{4} \]

where $\widetilde{P}^{(k)}_{uv}$ is the probability that a non-backtracking random walk of length k on G, which starts
in u, ends in v. The following theorem, proved in Section 2, determines the value of $\tilde\rho$ in this case:

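Theorem 1.1. Let $d \ge 3$ denote some integer, and let G be an $(n, d, \lambda)$-graph for some $\lambda < d$.
Define $\psi : [0, \infty) \to \mathbb{R}$ by

\[ \psi(x) = \begin{cases} x + \sqrt{x^2 - 1} & \text{if } x \ge 1 , \\ 1 & \text{if } 0 \le x \le 1 . \end{cases} \tag{5} \]

Then a non-backtracking random walk on G converges to the uniform distribution, and its mixing
rate, $\tilde\rho$, satisfies:

\[ \tilde\rho = \psi\!\left( \frac{\lambda}{2\sqrt{d-1}} \right) \Big/ \sqrt{d-1} . \tag{6} \]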
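
To get a feel for formula (6), the following sketch evaluates it next to the simple random walk rate $\lambda/d$ that the text compares it with. It only restates (5) and (6) in code; the function names and the sample values of d and $\lambda$ are illustrative assumptions, chosen with $\lambda \ge 2\sqrt{d-1}$.

```python
import math

def psi(x):
    """The function psi from (5): x + sqrt(x^2 - 1) for x >= 1, and 1 on [0, 1]."""
    if x < 0:
        raise ValueError("psi is defined on [0, infinity)")
    return x + math.sqrt(x * x - 1.0) if x >= 1.0 else 1.0

def nb_mixing_rate(d, lam):
    """Mixing rate (6) of the non-backtracking walk on an (n, d, lam)-graph."""
    return psi(lam / (2.0 * math.sqrt(d - 1))) / math.sqrt(d - 1)

def simple_mixing_rate(d, lam):
    """Mixing rate lam / d of the simple random walk on the same graph."""
    return lam / d

d = 10                                   # here 2 * sqrt(d - 1) = 6
for lam in (6.0, 7.0, 8.0, 9.0):
    rho_nb = nb_mixing_rate(d, lam)
    rho = simple_mixing_rate(d, lam)
    print(f"lambda={lam:.1f}  non-backtracking={rho_nb:.4f}  "
          f"simple={rho:.4f}  ratio={rho_nb / rho:.4f}")
```

At $\lambda = 2\sqrt{d-1}$ the ratio equals $d / (2(d-1))$, and it increases towards 1 as $\lambda$ approaches d.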



It is well known (see, e.g., ^S]), that if G is an $(n, d, \lambda)$-graph, then the mixing-rate of the
simple random walk on G is $\rho = \lambda/d$. As we state in Section 2, combining this with the properties
of the function $\psi$, defined in (5), gives the inequality $\tilde\rho \le \rho$, provided $d \le n^{o(1)}$. The closer $\lambda$ is to
$2\sqrt{d-1}$ (that is, the closer the graph is to being Ramanujan), the closer the ratio $\tilde\rho/\rho$ is to $\frac{d}{2(d-1)}$,
as demonstrated in Figure 1. This is formulated in the following corollary:

Corollary 1.2. Let G be a non-bipartite and connected d-regular graph on n vertices, for some
$d \ge 3$, and let $\rho$ and $\tilde\rho$ denote the mixing rates of simple and non-backtracking random walks on G,
respectively. The following holds: let $\lambda$ be the second largest eigenvalue of G in absolute value. If
$\lambda \ge 2\sqrt{d-1}$, then

\[ \frac{d}{2(d-1)} \le \frac{\tilde\rho}{\rho} \le 1 . \tag{7} \]

If $\lambda < 2\sqrt{d-1}$ and $d = n^{o(1)}$, then $\tilde\rho/\rho = \frac{d}{2(d-1)} + o(1)$, where the o(1)-term tends to 0 as $n \to \infty$.
In Section 3 we discuss the maximal load of a set of vertices along n consecutive positions of a
non-backtracking random walk. The next theorem states that the maximal number of times that
such a walk on a regular expander of high girth visits a vertex is equal to $(1 + o(1)) \frac{\log n}{\log\log n}$, precisely
the maximal load in the balls and bins experiment.

Theorem 1.3. Let G be an $(n, d, \lambda)$-graph for some fixed $d \ge 3$ and some fixed $\lambda < d$, whose girth
is $g \ge 10 \log_{d-1} \log n$. With high probability, the maximal number of times that a non-backtracking
random walk of length n on G visits a vertex is equal to $(1 + o(1)) \frac{\log n}{\log\log n}$.

Furthermore, the above requirement on the girth is essentially tight: in Section 3 we show that,
for all g = g(n), there are graphs as described in Theorem 1.3 with girth g, for which the above
maximal number of visits is $\Omega\!\left( \frac{\log n}{g} \right)$ almost surely.

The final section, Section 4, is devoted to several open problems, further related to random
walks on expanders and to similar notions of conserving randomness.

2 The mixing rate of a non-backtracking random walk

Proof of Theorem 1.1. We begin with some preliminaries on Chebyshev polynomials; for further
information, see, e.g., [20]. The Chebyshev polynomials of the second kind, of degree $k \ge 0$, are
the following polynomials:

\[ U_k(\cos\theta) = \frac{\sin((k+1)\theta)}{\sin\theta} . \tag{8} \]

Also, it is convenient to define $U_{-1}(x) = 0$. The Chebyshev polynomials satisfy the following
three-term recurrence relation:

\[ U_{k+1}(x) = 2x\, U_k(x) - U_{k-1}(x) , \text{ for all } k \ge 0 , \tag{9} \]

and are orthogonal with respect to the Wigner semicircle measure $d\sigma(x) = \frac{2}{\pi} \sqrt{1 - x^2}\, \mathbf{1}_{[-1,1]}(x)\, dx$.
Let A = A(G) denote the adjacency matrix of G, and define the n x n matrix $A^{(k)}$ for $k \ge 1$:

\[ A^{(k)}_{u,v} = \left| \widetilde{W}^{(k)}_{u,v} \right| \text{ for all } u, v \in V . \]

That is, the entry of $A^{(k)}$ at indices u, v is equal to the number of non-backtracking walks of length
k from u to v. By definition, the matrices $A^{(k)}$ satisfy the following recurrence relation:

\[ \begin{cases} A^{(1)} = A , \quad A^{(2)} = A^2 - dI , \\ A^{(k+1)} = A A^{(k)} - (d-1) A^{(k-1)} \text{ for } k = 2, 3, \ldots , \end{cases} \tag{10} \]

where the last term above, $(d-1) A^{(k-1)}$, eliminates the walks which backtrack in the (k+1)-st step.
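
The recurrence (10) is easy to test against a direct enumeration on a small graph. The sketch below does this for the complete graph $K_4$ (3-regular) using plain nested lists; the function names and the choice of example graph are assumptions made for illustration only.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][t] * Y[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]

def nb_walk_matrices(A, d, k_max):
    """Return {k: A^(k)} for 1 <= k <= k_max, computed via recurrence (10)."""
    n = len(A)
    A2 = mat_mul(A, A)
    mats = {1: [row[:] for row in A],
            2: [[A2[i][j] - (d if i == j else 0) for j in range(n)] for i in range(n)]}
    for k in range(2, k_max):
        AAk = mat_mul(A, mats[k])
        mats[k + 1] = [[AAk[i][j] - (d - 1) * mats[k - 1][i][j] for j in range(n)]
                       for i in range(n)]
    return {k: mats[k] for k in range(1, k_max + 1)}

def count_nb_walks(A, u, v, k):
    """Count non-backtracking walks of length k from u to v by direct enumeration."""
    n = len(A)

    def extend(prev, cur, steps_left):
        if steps_left == 0:
            return 1 if cur == v else 0
        return sum(extend(cur, w, steps_left - 1)
                   for w in range(n) if A[cur][w] and w != prev)

    return extend(None, u, k)

# Adjacency matrix of K_4.
K4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
mats = nb_walk_matrices(K4, d=3, k_max=6)
agree = all(mats[k][u][v] == count_nb_walks(K4, u, v, k)
            for k in range(1, 7) for u in range(4) for v in range(4))
print(agree)
```

Each row of $A^{(k)}$ sums to $d(d-1)^{k-1}$, the total number of non-backtracking walks of length k from a fixed vertex, which is the normalization used later in (13).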

We claim that:

\[ A^{(k)} = \sqrt{d(d-1)^{k-1}}\; q_k\!\left( \frac{A}{2\sqrt{d-1}} \right) \text{ for all } k \ge 1 , \tag{11} \]

where:

\[ q_k(x) = \sqrt{\frac{d-1}{d}}\, U_k(x) - \frac{1}{\sqrt{d(d-1)}}\, U_{k-2}(x) \text{ for all } k \ge 1 . \tag{12} \]

To see this, let $f(A, k) = \sqrt{d(d-1)^{k-1}}\, q_k\!\left( A / (2\sqrt{d-1}) \right)$ denote the right hand side of (11).
Substituting the polynomials $U_{-1}(x) = 0$, $U_0(x) = 1$, $U_1(x) = 2x$ and $U_2(x) = 4x^2 - 1$ in (12) implies
that $f(A, 1) = A = A^{(1)}$ and that $f(A, 2) = A^2 - dI = A^{(2)}$, confirming (11) for k = 1, 2. In order
to verify that (11) holds for all $k \ge 3$, recall that $q_k(x)$ is a linear combination of the polynomials
$U_{k-2}$ and $U_k$, hence it satisfies the recurrence (9):

\[ q_{k+1}(x) = 2x\, q_k(x) - q_{k-1}(x) \text{ for all } k \ge 2 . \]

Therefore, by induction, the following holds for all $k \ge 2$:

\begin{align*}
f(A, k+1) &= \sqrt{d(d-1)^{k}}\; q_{k+1}\!\left( \frac{A}{2\sqrt{d-1}} \right) \\
&= \sqrt{d(d-1)^{k}} \left[ \frac{A}{\sqrt{d-1}}\, q_k\!\left( \frac{A}{2\sqrt{d-1}} \right) - q_{k-1}\!\left( \frac{A}{2\sqrt{d-1}} \right) \right] \\
&= A A^{(k)} - (d-1) A^{(k-1)} = A^{(k+1)} ,
\end{align*}

where the last equality is by (10).
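
Since (11) expresses $A^{(k)}$ as a polynomial in A, it can be checked eigenvalue by eigenvalue: an eigenvalue $\lambda$ of A is mapped to the same number whether one runs the recurrence (10) on the scalar $\lambda$ or evaluates the right hand side of (11). The following sketch performs this scalar comparison; the function names and the test values are illustrative assumptions.

```python
import math

def chebyshev_u(k, x):
    """U_k(x) via the three-term recurrence (9), with U_{-1} = 0 and U_0 = 1."""
    if k < 0:
        return 0.0
    prev, cur = 0.0, 1.0             # U_{-1}, U_0
    for _ in range(k):
        prev, cur = cur, 2.0 * x * cur - prev
    return cur

def q(k, x, d):
    """The polynomial q_k from (12), for k >= 1."""
    return (math.sqrt((d - 1) / d) * chebyshev_u(k, x)
            - chebyshev_u(k - 2, x) / math.sqrt(d * (d - 1)))

def eigenvalue_via_q(lam, k, d):
    """Right hand side of (11), applied to an eigenvalue lam of A."""
    return math.sqrt(d * (d - 1) ** (k - 1)) * q(k, lam / (2.0 * math.sqrt(d - 1)), d)

def eigenvalue_via_recurrence(lam, k, d):
    """Eigenvalue of A^(k) obtained by running recurrence (10) on the scalar lam."""
    a_prev, a_cur = lam, lam * lam - d      # values for k = 1 and k = 2
    if k == 1:
        return a_prev
    for _ in range(k - 2):
        a_prev, a_cur = a_cur, lam * a_cur - (d - 1) * a_prev
    return a_cur

d = 3
for lam in (3.0, -1.0):                      # the two distinct eigenvalues of K_4
    for k in range(1, 6):
        print(lam, k, eigenvalue_via_recurrence(lam, k, d),
              round(eigenvalue_via_q(lam, k, d), 9))
```

For the trivial eigenvalue $\lambda = d$ both computations give $d(d-1)^{k-1}$, the number of non-backtracking walks of length k from a fixed vertex.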

Remark 2.1: One can verify that the polynomials $q_k(x)$ are orthogonal polynomials with respect
to the Kesten-McKay measure $d\sigma(x) = \frac{2d(d-1)}{\pi} \frac{\sqrt{1 - x^2}}{d^2 - 4(d-1)x^2}\, \mathbf{1}_{[-1,1]}(x)\, dx$.

Take $k \ge 1$, and recall that $A^{(k)}_{uv}$ is the number of non-backtracking walks of length k from u to
v. Normalizing the matrix $A^{(k)}$ as follows:

\[ \widetilde{P}^{(k)} = \frac{A^{(k)}}{d(d-1)^{k-1}} , \tag{13} \]

we obtain that $\widetilde{P}^{(k)}$ is precisely the transition probability matrix of a non-backtracking random
walk of length k. Let $\mu_1 = 1, \mu_2, \ldots, \mu_n$ denote the eigenvalues of $\widetilde{P}^{(k)}$, and let

\[ \mu = \mu(k) = \max\{ |\mu_2|, \ldots, |\mu_n| \} . \]



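Claim 2.2. Let $\widetilde{P}^{(k)}$ and $\mu(k)$ be as above. The following holds:

\[ \frac{\mu(k)}{n} \le \max_{i,j} \left| \widetilde{P}^{(k)}_{ij} - \frac{1}{n} \right| \le \mu(k) . \tag{14} \]

Proof. The vector $v_1 = \frac{1}{\sqrt{n}}(1, \ldots, 1)$ is an eigenvector of $\widetilde{P}^{(k)}$ corresponding to its largest eigenvalue
$\mu_1 = 1$, and therefore:

\[ \max_{u,v} \left| \widetilde{P}^{(k)}_{uv} - \frac{1}{n} \right| = \max_{u,v} \left| \left( \widetilde{P}^{(k)} - v_1 v_1^{T} \right)_{u,v} \right| \le \max_{|x| = |y| = 1} \left| x^{T} \left( \widetilde{P}^{(k)} - v_1 v_1^{T} \right) y \right| = \mu(k) . \tag{15} \]

On the other hand:

\[ \max_{i,j} \left| \widetilde{P}^{(k)}_{ij} - \frac{1}{n} \right| \ge \frac{1}{n} \sqrt{ \sum_{i,j} \left( \widetilde{P}^{(k)}_{ij} - \frac{1}{n} \right)^2 } = \frac{1}{n} \sqrt{ \sum_{2 \le i \le n} \mu_i^2 } \ge \frac{\mu(k)}{n} . \]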



We deduce that: 



\[ \tilde\rho = \limsup_{k \to \infty} \mu(k)^{1/k} = \max_{2 \le i \le n} \limsup_{k \to \infty} |\mu_i(k)|^{1/k} , \tag{16} \]

and it remains to compute the right hand side above. By (11) and (13), the following holds for all
$i \in [n]$:

\[ \mu_i(k) = \frac{1}{\sqrt{d(d-1)^{k-1}}}\; q_k\!\left( \frac{\lambda_i}{2\sqrt{d-1}} \right) , \]

where $\lambda_i$ are the eigenvalues of A. Therefore, the proof of the theorem will follow from the next
lemma:

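Lemma 2.3. The polynomials $q_k$, defined in (12), satisfy:

\[ \limsup_{k \to \infty} |q_k(x)|^{1/k} = \begin{cases} 1 , & -1 \le x \le 1 , \\ |x| + \sqrt{x^2 - 1} , & x \in \mathbb{R} \setminus [-1, 1] . \end{cases} \]

Proof. If $x \in [-1, 1]$, then $x = \cos\theta$ for some $\theta \in [0, \pi]$, and hence:

\[ q_k(x) = \sqrt{\frac{d-1}{d}}\, \frac{\sin((k+1)\theta)}{\sin\theta} - \frac{1}{\sqrt{d(d-1)}}\, \frac{\sin((k-1)\theta)}{\sin\theta} . \tag{17} \]

Therefore:

\[ |q_k(x)| \le \sqrt{\frac{d-1}{d}}\, (k+1) + \frac{1}{\sqrt{d(d-1)}}\, (k-1) , \]

and $\limsup_{k \to \infty} |q_k(x)|^{1/k} \le 1$. The reverse inequality follows from an appropriate subsequence $k_j$
for which the right hand side of (17) is bounded from below by some $c = c(\theta) > 0$.

It remains to treat $x \notin [-1, 1]$. In this case, $x = (z + z^{-1})/2$ for $z = x + \mathrm{sign}(x)\sqrt{x^2 - 1} \notin [-1, 1]$.
Setting $z = \mathrm{sign}(x) e^{\theta}$ for some real $\theta$, we get $x = \mathrm{sign}(x) \cos(i\theta)$, and therefore:

\begin{align*}
q_k(x) &= \mathrm{sign}(x)^k \left[ \sqrt{\frac{d-1}{d}}\, \frac{\sin((k+1) i\theta)}{\sin(i\theta)} - \frac{1}{\sqrt{d(d-1)}}\, \frac{\sin((k-1) i\theta)}{\sin(i\theta)} \right] \\
&= \sqrt{\frac{d-1}{d}}\, \frac{z^{k+1} - z^{-(k+1)}}{z - z^{-1}} - \frac{1}{\sqrt{d(d-1)}}\, \frac{z^{k-1} - z^{-(k-1)}}{z - z^{-1}} ,
\end{align*}

and $\limsup_{k \to \infty} |q_k(x)|^{1/k} = \lim_{k \to \infty} |q_k(x)|^{1/k} = |z|$.

This completes the proof of the lemma and of Theorem 1.1.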
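
The two regimes of Lemma 2.3 can be observed numerically. The sketch below iterates the recurrence $q_{k+1} = 2x q_k - q_{k-1}$ from the explicit values of $q_1$ and $q_2$ given by (12), and compares $|q_k(x)|^{1/k}$ at a moderately large k with the claimed limit. It is a finite-k illustration, not a proof; the function names and the sample points x = 1.5 and x = 0.5 are assumptions.

```python
import math

def q_sequence(x, d, k_max):
    """Return [q_1(x), ..., q_{k_max}(x)] using q_{k+1} = 2x q_k - q_{k-1} for k >= 2."""
    q1 = 2.0 * x * math.sqrt((d - 1) / d)
    q2 = math.sqrt((d - 1) / d) * (4.0 * x * x - 1.0) - 1.0 / math.sqrt(d * (d - 1))
    values = [q1, q2]
    while len(values) < k_max:
        values.append(2.0 * x * values[-1] - values[-2])
    return values[:k_max]

def growth_rate_estimate(x, d, k):
    """|q_k(x)|^(1/k), a finite-k proxy for the limsup in Lemma 2.3."""
    return abs(q_sequence(x, d, k)[-1]) ** (1.0 / k)

def predicted_rate(x):
    """The value claimed by Lemma 2.3."""
    return 1.0 if abs(x) <= 1.0 else abs(x) + math.sqrt(x * x - 1.0)

d = 3
print(growth_rate_estimate(1.5, d, 200), predicted_rate(1.5))    # outside [-1, 1]
print(max(abs(v) for v in q_sequence(0.5, d, 300)))              # inside: stays bounded
```

Outside $[-1, 1]$ the estimate approaches $|z|$ from the formula above, while inside the interval the values of $q_k$ stay within the linear bound stated in the proof, so their k-th roots tend to at most 1.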
Proof of Corollary 1.2. Let $\lambda$ denote the largest absolute value of a nontrivial eigenvalue of G.
Note that $\psi(x)$, as defined in Theorem 1.1, satisfies the following properties:

\[ \psi \text{ is strictly monotone increasing on } [1, \infty) , \quad \psi(1) = 1 , \]

\[ \frac{ \psi\!\left( \frac{x}{2\sqrt{d-1}} \right) }{ \sqrt{d-1} } = \frac{ x + \sqrt{x^2 - 4d + 4} }{ 2(d-1) } \le \frac{x}{d} \text{ for all } d \text{ and all } 2\sqrt{d-1} \le x \le d . \tag{18} \]

Therefore, if $\lambda \ge 2\sqrt{d-1}$, Theorem 1.1 implies that $\tilde\rho = \psi\!\left( \frac{\lambda}{2\sqrt{d-1}} \right) / \sqrt{d-1}$, and that:

\[ \frac{\lambda}{2(d-1)} \le \tilde\rho \le \frac{\lambda}{d} . \]

As $\rho = \lambda/d$, we obtain (7). Furthermore, as $\lambda$ decreases to $2\sqrt{d-1}$, $\psi\!\left( \frac{\lambda}{2\sqrt{d-1}} \right)$ tends to 1, implying
that $\tilde\rho \to \frac{1}{\sqrt{d-1}}$, and $\tilde\rho/\rho \to \frac{d}{2(d-1)}$.

It remains to handle the case $\lambda < 2\sqrt{d-1}$. To this end, recall the following result of Nilli [17],
which implies the Alon-Boppana Theorem:

Theorem 2.4 ([17]). If G is a simple undirected d-regular graph with diameter at least 2(k + 1),
then the second largest eigenvalue of G, $\lambda_2$, satisfies $\lambda_2 \ge 2\sqrt{d-1} - \frac{2\sqrt{d-1} - 1}{k+1}$.

As the diameter of a d-regular graph on n vertices is at least $(1 - o(1)) \log_{d-1} n$, we deduce that
in the above case, if $d = n^{o(1)}$ then $\lambda = (1 - o(1))\, 2\sqrt{d-1}$. In this case, by Theorem 1.1 we have
$\tilde\rho = 1/\sqrt{d-1}$, and $\tilde\rho/\rho = \frac{d}{2(d-1)} + o(1)$. ■

Remark 2.5: Examining the trace of the square of the adjacency matrix of a graph, it is easy to
see that for every d-regular graph on n vertices, the second largest eigenvalue in absolute value is
at least $\sqrt{\frac{d(n-d)}{n-1}}$. It thus follows that if d = o(n) then $\tilde\rho \le (1 + o(1))\rho$.

Remark 2.6: For d-regular graphs with $d = \Theta(n)$ the mixing rate of the simple random walk
may indeed be faster than that of the non-backtracking random walk. For instance, if G is the
complete graph on n vertices, $K_n$, then by Theorem 1.1 $\tilde\rho = \frac{1}{\sqrt{n-2}}$, and $\rho = \frac{1}{n-1}$.

3 Random walks and the balls and bins paradigm 



Proof of Theorem 1.3. Let G be as described in Theorem 1.3. The following definition of
the mixing-time of a non-backtracking random walk on G corresponds to an $L_\infty$ distance of $1/n^2$
between $\pi$ and $\widetilde{P}^{(k)}_u$, for all $u \in V$:

\[ \tilde\tau = \min \left\{ t : \left| \widetilde{P}^{(k)}_{uv} - \frac{1}{n} \right| \le \frac{1}{n^2} \text{ for all } u, v \in V \text{ and } k \ge t \right\} . \tag{19} \]

Theorem 1.1 implies that a non-backtracking random walk on G converges to the uniform
distribution at a mixing-rate of $\tilde\rho = \psi\!\left( \frac{\lambda}{2\sqrt{d-1}} \right) / \sqrt{d-1}$, and we deduce that $\tilde\tau = O(\log n)$ (by usual
arguments linking the mixing-rate to the mixing-time).

The proof of Theorem 1.3 will follow from the next two lemmas, which we prove using first and
second moment arguments (see, e.g., combined with some additional ideas. 

Lemma 3.1. Let G be as in Theorem 1.3. With high probability, a non-backtracking random walk
of length n on G does not visit a vertex more than $(1 + o(1)) \frac{\log n}{\log\log n}$ times.

Lemma 3.2. Let G be as in Theorem 1.3. With high probability, a non-backtracking random walk
of length n on G visits some vertex at least $(1 + o(1)) \frac{\log n}{\log\log n}$ times.

The key element in the proofs of both lemmas is showing that the number of times that a
non-backtracking random walk visits some vertex, or some pair of vertices, is governed by visits at
locations which are at least $\tilde\tau$ apart. This implies a behavior which is essentially the same as the
one in the balls and bins experiment.

Proof of Lemma 3.1. Let $u, v \in V$ denote two vertices, so that either u = v or the distance
between u and v in G is at least

\[ L = 10 \log_{d-1} \log n , \tag{20} \]

and let $\widetilde{P}^{(\ell)}_{uv}$ denote the probability that a non-backtracking random walk of length $\ell$ on G, which
starts at u, ends in v. We claim that:

\[ \widetilde{P}^{(\ell)}_{uv} \le \begin{cases} (d-1) / (\log n)^5 & \text{if } \ell < \tilde\tau , \\ \left( 1 + \frac{1}{n} \right) / n & \text{if } \ell \ge \tilde\tau . \end{cases} \tag{21} \]

The case $\ell \ge \tilde\tau$ follows directly from the definition (19) of the mixing time $\tilde\tau$. For the case $\ell < \tilde\tau$,
let $W = (u = w_0, w_1, \ldots, w_\ell)$ denote a non-backtracking random walk of length $\ell$ on G, starting
at u. The choice of u, v and the fact that g, the girth of G, is at least L (this applies to the case
u = v), imply that there is no non-empty path between u, v of length shorter than L. Therefore, if
$\ell < L$ then $\Pr[w_\ell = v] = 0$. Otherwise, let $h = \lfloor \frac{L-1}{2} \rfloor$, and notice that the neighborhood of v up to
distance h is precisely a d-regular tree (as $L \le g$). Let U denote the $d(d-1)^{h-1}$ leaves of this tree.
Since the random walk W cannot backtrack, the event $w_\ell = v$ implies that $w_{\ell-h} \in U$, hence:

\begin{align*}
\widetilde{P}^{(\ell)}_{uv} = \Pr[w_\ell = v] &= \Pr[w_\ell = v \mid w_{\ell-h} \in U]\, \Pr[w_{\ell-h} \in U] \\
&\le \Pr[w_\ell = v \mid w_{\ell-h} \in U] = (d-1)^{-h} \le \frac{d-1}{(\log n)^5} .
\end{align*}
Let $\varepsilon > 0$, and set $k = (1 + \varepsilon) \frac{\log n}{\log\log n}$. Consider a non-backtracking random walk of length n on
G, $W = (w_0, w_1, \ldots, w_n)$, where $w_0$ is a fixed vertex of V. For each vertex $v \in V$, and for each
$t \in \{0, \ldots, k\}$, define the following event:

\[ A_{v,t} = \left( \begin{array}{l} W \text{ visits } v \text{ at least } k \text{ times at some indices } 1 \le i_1 < \ldots < i_k < \ldots \\ \text{and } |\{ j \in [k-1] : i_{j+1} - i_j < \tilde\tau \}| = t \end{array} \right) . \]

That is, $A_{v,t}$ describes the event in which precisely t of the first k - 1 segments of W, which are
bounded by consecutive visits to v, are of length smaller than $\tilde\tau$. Considering all the possible ways
to choose indices $i_1, \ldots, i_k$ according to the definition of $A_{v,t}$, we derive the following from (21):

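\[ \Pr[A_{v,t}] \le \binom{n}{k-t} \binom{k-1}{t}\, \tilde\tau^{\,t} \left( \frac{1 + n^{-1}}{n} \right)^{k-t} \left( \frac{d-1}{(\log n)^5} \right)^{t} . \tag{22} \]

For $0 \le t < k - 1$, replacing t by t + 1 in the right hand side of (22) results in a multiplicative
factor of:

\[ \frac{(k-t)(k-t-1)}{(n-k+t+1)(t+1)} \cdot \frac{\tilde\tau n}{1 + n^{-1}} \cdot \frac{d-1}{(\log n)^5} = O\!\left( \frac{k^2 \tilde\tau}{(\log n)^5} \right) = o(1) . \]

Therefore, the largest term is obtained for t = 0. Letting $A_v = \bigcup_{t=0}^{k} A_{v,t}$ denote the event that W
visits the vertex v at least k times, we get:

\[ \Pr[A_v] \le 2 \binom{n}{k} \left( \frac{2}{n} \right)^{k} \le \frac{2^{k+1}}{k!} = o(1/n) , \]

where the last step is by the assumption on k. Therefore, $\Pr[\bigcup_{v \in V} A_v] = o(1)$, and with high
probability, W does not visit any vertex of V more than k times. ■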

Proof of Lemma 3.2. Let $\varepsilon > 0$, and set $k = \lceil (1 - \varepsilon) \frac{\log n}{\log\log n} \rceil$. Let W denote a non-backtracking
random walk of length n on G, $W = (w_0, w_1, \ldots, w_n)$, where $w_0$ is a fixed vertex of V. We wish
to show that, with high probability, W visits some vertex $v \in V$ at least k times. We will show
that, in fact, this statement holds even if we restrict ourselves to a predefined subset of the vertices
$U \subset V$, and in addition, restrict the pattern of the visiting locations.

Let $U \subset V$ denote a set of vertices of G of size

\[ |U| = \lceil n / (d (\log n)^{10}) \rceil , \tag{23} \]

so that the distance between any pair of vertices $u, v \in U$ is at least $L = 10 \log_{d-1} \log n$ (as defined
in (20)). To see that such a set U indeed exists, notice that the number of vertices whose distance
from some $u \in U$ is at most L - 1 does not exceed $\sum_{i=0}^{L-1} d(d-1)^i \le d(d-1)^L$. Therefore, a greedy
algorithm which begins with an empty set, and repeatedly adds a new legal vertex to U, always
succeeds in producing a set of size at least $n / (d (\log n)^{10})$.

The restriction we impose on the pattern of visits is defined next:

Definition. Let $T \in \binom{[n]}{k}$ denote a set of k indices in [n]. We say that T is a k-pattern iff
$T \cap [2\tilde\tau] = \emptyset$ and $|i - j| > 2\tilde\tau$ for all distinct $i, j \in T$. In other words, the values of the elements of T, and
the pairwise distances between these elements, all exceed $2\tilde\tau$.

The above definition implies that, if T is a k-pattern, then for all $i \in [n]$, there is at most one
element $j \in T$ so that $|i - j| \le \tilde\tau$. This makes it useful to define the correlation between k-patterns
as follows:
as follows: 
Definition. Let $T_1$ and $T_2$ denote two k-patterns. The correlation between $T_1$ and $T_2$, $\delta(T_1, T_2)$, is
defined as the number of pairs in $T_1 \times T_2$ with distance at most $\tilde\tau$:

\[ \delta(T_1, T_2) = \left| \{ (a, b) \in T_1 \times T_2 : |a - b| \le \tilde\tau \} \right| . \]

Let $\mathcal{K}$ denote the collection of all k-patterns, and notice that:

\[ |\mathcal{K}| = \binom{n - 2\tilde\tau k}{k} . \tag{24} \]

Define the following set of indicator variables for all $u \in U$ and $T \in \mathcal{K}$:

\[ X_{u,T} = \begin{cases} 1 & \text{if } w_i = u \text{ for all } i \in T , \\ 0 & \text{otherwise.} \end{cases} \tag{25} \]

In other words, $X_{u,T}$ is the indicator for the event according to which the non-backtracking walk
W visits u in all the time-points specified by T. By definition, the first of these time-points exceeds
$\tilde\tau$, and the same holds for the distance between each consecutive pair of these time-points, and by
the definition of $\tilde\tau$ we deduce that:

\[ \left( \frac{1 - n^{-1}}{n} \right)^{k} \le \Pr[X_{u,T} = 1] \le \left( \frac{1 + n^{-1}}{n} \right)^{k} . \tag{26} \]



n ) \ n 

Setting $X = \sum_{u \in U,\, T \in \mathcal{K}} X_{u,T}$, we get:



EA->| C /|("-, 2Tt Vi^-) =„-*), (27) 



k J \ n 

where the last equality is by the definition of k and (j2Hj) . 

In order to show that X is concentrated around its expected value, we consider its second
moment. Let $u, v \in U$ so that $u \ne v$, and let $t \in \{0, 1, \ldots, k\}$. Take $T_1, T_2 \in \mathcal{K}$, so that $\delta(T_1, T_2) = t$.
By the definition of U, the distance between u and v is at least L. Hence, if $|a - b| < L$ for some
$(a, b) \in T_1 \times T_2$, then the events $(X_{u,T_1} = 1)$ and $(X_{v,T_2} = 1)$ are disjoint. Otherwise, consider the
probability of the event $(X_{u,T_1} = 1) \wedge (X_{v,T_2} = 1)$. By (21), the largest of each of the t pairs of indices
$(a_j, b_j) \in T_1 \times T_2$, which satisfy $|a_j - b_j| \le \tilde\tau$, contributes a probability of at most $(d-1)/(\log n)^5$ to
this event. The definition of $\tilde\tau$ implies that each of the remaining indices contributes a probability
of at most $(1 + n^{-1})/n$ for visiting the required vertex (either u or v), and altogether:

\[ \Pr[X_{u,T_1} = 1 \wedge X_{v,T_2} = 1] \le \left( \frac{1 + n^{-1}}{n} \right)^{2k - t} \left( \frac{d-1}{(\log n)^5} \right)^{t} . \tag{28} \]



Combining (|2*B|) and ((2*5)) gives: 

Cov(X UjTi ,X v ,t 2 ] 

Tie/c T 2 ex: 

<5(Ti ,T 2 )=t 



< 




a— 1 \ 1 — n 



(log n) 5 y \ n 

1+ „-i)» ftii^iK^y -(i-„-)* 

7 V (logn) b / v ' 
Let $C_{uv}(t)$ denote the right hand side in the above inequality. Since $(1 + n^{-1})^{2k}$ and $(1 - n^{-1})^{2k}$
both tend to 1 as n, and hence k, tend to $\infty$, the following holds for all $t \ge 1$:

c m ( t+ i)_ (k-tnir) ^^.offt).,,,!. 



C uv {t) (* + l)(n-(2T + l)fc + t + l) v v " (logn) 5 V( lo g^) 
In particular, for a sufficiently large n we deduce that 

k 

J2C uv (t) <2C W (1) , (29) 
t=i 

and it remains to examine C uv (t) for t £ {0, 1}: 



cwo) 



7X2//-,, _i\ 2k -1\ 2fc N 

n — It k \ I 1 1 + n \ ( 1 — n ■ 



k J \ \ n J \ n 



2 / \ 2k r, /-, , _l\2fe-l /, /wvX 2 

n 



k — 1 J (log n) 5 

EX\ 2 / n \ M 2rfe (1 + o(l))k(d - 1) _ / rfc 2 /EX\ 2 \ 

Vl-^V « 2fc (logn) 5 I (logn) 5 V \U\ J ' 1 ' 



By (EH]), (Enj) and (jSH we get: 

k 

EE E Cov(X u , Tl ,X ?; , T2 )<^^^C™(t) = ((EX) 2 ). (32) 

ne(7i>eC/(Ti,T 2 )e/c 2 «ef/t>e(7t=o 

Next, take u £ U, and consider all /c-patterns Ti 7^ T2 which contain / G {0, . . . , k — 1} common 
indices, and whose correlation, 5(Ti,T2), is some t G {/,... , The following holds: 

2 2 Cov(X u , Tl ,X u , T2 ) 
|TinT 2 |=i 

6(T u T 2 )=t 

Let C u (l,t) denote the final expression of (|33j) . For all / and t so that / < t < k we have: 

C u (Z,t + l) (l + o(l))(A ; -t) 2 2r(d-l) Q f )=o(l) . (34) 



C u (Z,t) t-l + 1 (logn) 5 V( lo g™ 

This implies that the leading order term in the sum Ylt=l 1S C u (l,l). Next, 

C u (l + 1,1 + 1) = (l + o(l))(fc-Q 2 
C u (l,l) l + l 
hence, if we define:

Iq = k — 2s/k , l\ = k — -^fk 



then the following holds: 

( a 

C u (l,l) 



C u (i+i,i+i) > 4 + (x) if/</ , 



(35) 



On the other hand, for every I £ [Zo>^i] we have: 

a.( i ,0M 1+ o( 1 ))("-r)C)(";-T)»- 2W 

, NN EX „ m / (EX) 2 \ , . 

=( 1+ °( 1 »w" °(lW^J' <36) 

where the last equality is by (|2*Tj) . We deduce from > and (|3*B|) that for all sufficiently large 
values of n: 

k-l fc fc-i /i 

E E *) ^ 2 E ^ 4C ^0' /o) + 4C u (h,h) + 2 E C tt (Z, I) 

1=0 t=l 1=0 l=l 



of Vk- 



(EX) 2 \ ( (EX) 2 



\U\n £ / 2 J V W\ 



and thus: 



fe-1 k 



EE E cov(x u , Tl ,x u , T2 ) < EEE C «(^) = °(w 2 ) • (37) 

Combining and (|3*7|) (and recalling that EX = uj(1)) gives: 

Var(X) < EX + E E Cov^t^X^tJ = o((EX) 2 ) , 

u,Ti v,T 2 

and Chebyshev's inequality implies that: 

This completes the proof of Lemma 3.2 and of Theorem 1.3. ■

We note that the $\Omega(\log\log n)$ requirement on the girth of G in Theorem 1.3 is tight, as there
are $(n, d, \lambda)$-graphs with girth g, where a non-backtracking random walk visits some vertex at least
$\Omega\!\left( \frac{\log n}{g} \right)$ times almost surely. This is stated in the next claim.
Claim 3.3. Let G be a d-regular graph on n vertices, in which each vertex is contained in a cycle
of length g = g(n). If k = k(n) satisfies:

\[ k = \frac{ \log_{d-1}(n / \log n) - \omega(1) }{ g } , \]

then, with high probability, a non-backtracking random walk of length n on G visits some vertex at
least k times. In particular, such a walk almost surely visits some vertex $\Omega\!\left( \frac{\log n}{g} \right)$ times.

Proof. For each $v \in V$, let $C_v$ denote a cycle of length g which contains v in G. Let
$W = (w_0, w_1, \ldots, w_n)$ denote a non-backtracking random walk of length n on G, and divide W into
$T = \lfloor n / (kg) \rfloor$ disjoint segments, $I_1, \ldots, I_T$, each of length kg:

\[ I_j = (w_{(j-1)kg}, \ldots, w_{jkg-1}) \text{ for all } j \in [T] . \]

Define the following event for each $j \in [T]$:

\[ A_j = \left( \begin{array}{l} \text{The segment } I_j \text{ of } W \text{ is precisely } k \text{ consecutive walks} \\ \text{along the same cycle } C_v , \text{ where } v = w_{(j-1)kg} \end{array} \right) . \]

To prove the claim, it suffices to show that, with high probability, at least one of the events $A_j$
($j \in [T]$) occurs. Since these events are independent, and $\Pr[A_j] = (d-1)^{-kg}$ for all j, we get:

\[ \Pr\left[ \bigcap_{j=1}^{T} \overline{A_j} \right] = \left( 1 - (d-1)^{-kg} \right)^{T} \le \exp\left( -T (d-1)^{-kg} \right) . \]

The choice of k ensures that $T (d-1)^{-kg} = \omega(1)$, and the result follows. ■

Remark 3.4: Theorem 1.3 stated that the maximal load in a non-backtracking walk of length
n on a d-regular expander of high girth is $(1 + o(1)) \frac{\log n}{\log\log n}$ with high probability, similar to the
maximal load in the classical balls and bins experiment. In contrast to this, a simple calculation
shows that a typical simple random walk of length n, on any d-regular graph for a fixed d, has
a maximal load of $\Omega(\log n)$. This can be seen as a special case of Claim 3.3, taking g = 2: the
probability that the simple random walk traverses the same edge repeatedly for, say, $M = \frac{1}{2} \log_d n$
consecutive steps, is $1/\sqrt{n}$. Dividing the walk into disjoint segments of length M implies that, with
probability $1 - o(1)$, at least one segment exhibits this behavior, thus the maximal load is at least
M.
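
The contrast described in Remark 3.4 can be explored empirically. The sketch below is a rough experiment and not part of the original argument. It samples a random d-regular graph by the pairing model with rejection, runs a simple and a non-backtracking walk of length n, and reports the maximal load of each. The random regular graph is an assumption standing in for a high-girth expander (it may contain short cycles), and all function names are hypothetical.

```python
import random
from collections import Counter


def random_regular_graph(n, d, rng):
    """Pairing model with rejection: adjacency lists of a simple d-regular graph.
    Assumption: used here as a stand-in for a high-girth expander."""
    assert (n * d) % 2 == 0 and d < n
    while True:
        stubs = [v for v in range(n) for _ in range(d)]
        rng.shuffle(stubs)
        adj = [set() for _ in range(n)]
        ok = True
        for a, b in zip(stubs[::2], stubs[1::2]):
            if a == b or b in adj[a]:    # reject self-loops and repeated edges
                ok = False
                break
            adj[a].add(b)
            adj[b].add(a)
        if ok:
            return [sorted(s) for s in adj]


def walk(adj, length, rng, non_backtracking):
    """Return the vertices w_0, ..., w_length of a simple or non-backtracking walk."""
    cur, prev = rng.randrange(len(adj)), None
    path = [cur]
    for _ in range(length):
        options = adj[cur]
        if non_backtracking and prev is not None:
            options = [u for u in options if u != prev]   # forbid the edge just used
        prev, cur = cur, rng.choice(options)
        path.append(cur)
    return path


def max_load(path):
    """Largest number of times any single vertex appears on the path."""
    return max(Counter(path).values())


if __name__ == "__main__":
    rng = random.Random(1)               # fixed seed, for reproducibility only
    n, d = 2000, 3
    adj = random_regular_graph(n, d, rng)
    print("simple walk:          ", max_load(walk(adj, n, rng, False)))
    print("non-backtracking walk:", max_load(walk(adj, n, rng, True)))
```

A single run at this size is only suggestive; the asymptotic gap between $\Omega(\log n)$ and $(1 + o(1)) \frac{\log n}{\log\log n}$ grows slowly with n.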

Remark 3.5: The classical Birthday Paradox states that, when throwing balls to n bins,
independently and uniformly at random, we expect a collision after $O(\sqrt{n})$ balls (see, e.g., (HI). Relating
this to random walks on expanders, one may ask when do simple and non-backtracking random
walks on expanders self-intersect. Clearly, most simple random walks on an expander encounter
a collision after O(1) steps (the first time at which an edge is traversed twice in a row). An
argument similar to the one used in the proof of Claim 3.3 shows that, for every small $\varepsilon > 0$, there
are $(n, d, \lambda)$-graphs with girth $g = \varepsilon \log_{d-1} n$, on which a non-backtracking random walk will
self-intersect after at most $n^{\varepsilon + o(1)}$ steps almost surely. Similarly, for $g = o(\log n)$, there are such graphs
where the self-intersection time of the non-backtracking random walk is at most $(d-1)^{g+o(1)}$.
4 Concluding remarks and open problems



We have shown that a non-backtracking random walk on every connected and non-bipartite
d-regular graph G, where $d \ge 3$, converges to the uniform distribution, and computed its
precise mixing-rate. We obtained that this mixing-rate is always asymptotically at least as
fast as that of the simple random walk on the same graph provided d = o(n) (and is faster
provided $d \le n^{o(1)}$), and their ratio may reach up to $2(d-1)/d$.

As an application, we showed that if G is a high-girth d-regular expander on n vertices, for
some fixed $d \ge 3$, then the maximal load while sampling n consecutive positions of a
non-backtracking random walk on G is almost surely $(1 + o(1)) \frac{\log n}{\log\log n}$, similar to the maximal
load in the classical balls and bins experiment. Performing a simple random walk, instead of
a non-backtracking one, results in a maximal load of $\Omega(\log n)$ with high probability.

Following the Poisson approximations in the balls and bins model, it would be interesting to 
establish the precise distribution of a sample of n consecutive positions of a non-backtracking 
random walk on an expander of high girth. 

The well known "power-of-two" result ([I], see also [16], Chapter 14) states that if n balls are
thrown into n bins, where each ball is placed in the least loaded bin, out of two independently
chosen random ones, then the maximal load decreases from $\Theta(\frac{\log n}{\log\log n})$ to $\Theta(\log\log n)$. Let
$W_1$ and $W_2$ denote two non-backtracking random walks on an expander of high girth, and
suppose that in each step we are given a choice between the two current locations of $W_1$
and $W_2$, and pick the least loaded one. Does the maximal load decrease from $\Theta(\frac{\log n}{\log\log n})$ to
$\Theta(\log\log n)$ in this setting as well?
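
The classical balls and bins comparison that motivates this question is easy to simulate. The following Python sketch is a minimal illustration with independent uniform choices (it does not involve random walks); the function names and the sample size are assumptions made for the example.

```python
import random


def one_choice_loads(n, rng):
    """Throw n balls into n bins, each into a uniformly random bin."""
    loads = [0] * n
    for _ in range(n):
        loads[rng.randrange(n)] += 1
    return loads


def two_choice_loads(n, rng):
    """Throw n balls into n bins; each ball picks two independent uniform bins
    and goes to the one with the smaller current load (ties go to the first)."""
    loads = [0] * n
    for _ in range(n):
        a, b = rng.randrange(n), rng.randrange(n)
        target = a if loads[a] <= loads[b] else b
        loads[target] += 1
    return loads


if __name__ == "__main__":
    rng = random.Random(2)               # fixed seed, for reproducibility only
    n = 100_000                          # hypothetical sample size
    print("one choice :", max(one_choice_loads(n, rng)))
    print("two choices:", max(two_choice_loads(n, rng)))
```

Replacing the two independent bin choices by the current positions of two non-backtracking walks would give an experiment for the open question itself; that variant is not sketched here.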

One way of proving the above power-of-two result in the balls and bins model is to consider
the Erdős–Rényi random graph process $\mathcal{G}^t$, $t \in \{0, 1, \ldots, \binom{n}{2}\}$ (where $\mathcal{G}^0$ is the empty graph
on n vertices, and in each step a new edge is added, uniformly chosen over all missing edges;
see, e.g., (Hj, Chapter 2). Each pair of bins corresponds to a uniformly chosen edge in the
graph (we may ignore self-loops or repeating edges, as we are dealing with a linear number
of balls). Selecting a bin corresponds to choosing an orientation for this edge. One can show
that the greedy online algorithm, which orients an edge towards the vertex with the lower
in-degree, gives an overall maximal in-degree of $O(\log\log n)$ with high probability. This is
based on the following properties of $\mathcal{G}^t$, which hold with high probability for all $t \le \alpha n$, where
$0 < \alpha < \frac{1}{2}$ is a constant:

(1) Each connected component of $\mathcal{G}^t$ is of logarithmic size.

(2) For some fixed $\delta$, the average degree of every induced subgraph of $\mathcal{G}^t$ is at most $\delta$.
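
The greedy orientation and property (1) can be observed together in a small experiment. The sketch below is a simplified illustration under stated assumptions: edges are sampled with replacement and self-loops are allowed (rather than drawn from the missing edges), components are tracked with a union-find structure, and the name `greedy_orientation` is hypothetical. Property (2) is not checked.

```python
import random


def greedy_orientation(n, num_edges, rng):
    """Add random edges one at a time, orienting each towards the endpoint of
    lower in-degree. Returns (maximal in-degree, largest component size)."""
    indeg = [0] * n
    parent = list(range(n))              # union-find forest over the vertices
    size = [1] * n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]    # path halving
            x = parent[x]
        return x

    for _ in range(num_edges):
        # Simplification: endpoints are independent, so self-loops and
        # repeated edges may occur.
        u, v = rng.randrange(n), rng.randrange(n)
        head = u if indeg[u] <= indeg[v] else v
        indeg[head] += 1
        ru, rv = find(u), find(v)
        if ru != rv:                     # merge the smaller component into the larger
            if size[ru] < size[rv]:
                ru, rv = rv, ru
            parent[rv] = ru
            size[ru] += size[rv]
    return max(indeg), max(size[find(x)] for x in range(n))


if __name__ == "__main__":
    rng = random.Random(3)               # fixed seed, for reproducibility only
    n = 100_000
    for alpha in (0.25, 0.4, 0.6):       # hypothetical values on both sides of 1/2
        print(alpha, greedy_orientation(n, int(alpha * n), rng))
```

Values of alpha below and above 1/2 are included so that the change in the largest component size around the giant component threshold is visible.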

The above discussion suggests the following approach: let G be a d-regular expander of high
girth, for some fixed $d \ge 3$, and let $W_1$ and $W_2$ denote two non-backtracking random walks on
G. Define a random (multi)graph process by adding the edge $(W_1(t), W_2(t))$ at step t, where
$W_i(t)$ is the position of $W_i$ at time t. This can be viewed as a certain de-randomization of the
random graph process, where the graph at time $O(n)$ is produced using only $O(n)$ random bits
(instead of $O(n \log n)$ bits). This model, on its own account, seems interesting, with respect
to the commonly studied questions on graph processes, e.g., whether there exists a sharp
threshold for the appearance of a giant component. In particular, proving that properties (1)
and (2) hold for this graph process for all $t \le \alpha n$ and some $0 < \alpha < \frac{1}{2}$ will imply a positive
answer to the previous question, regarding the power-of-two with non-backtracking random
walks.